
Dual Comparison

The Dual Comparison tab lets users evaluate two validations side by side, making it easier to spot changes in model performance across datasets, annotation sets, or model versions. The interface presents metrics, confusion matrices, and performance distributions to support detailed analysis.


1. Validation Selection

Users can select two validations:

  • Base Validation: The primary reference point
  • Compared Validation: The second run for comparison

Each selection displays metadata, including:

  • Model Name & Type
  • Validation Name
  • Dataset & Modality
  • Annotation Set
  • Sample Count (Total, Done, Failed)
  • Tags and Source Info
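
For orientation, the metadata shown for a selected validation can be pictured as a simple record. The sketch below is illustrative only; the field names and values are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: field names are assumptions, not the product's schema.
@dataclass
class ValidationSummary:
    model_name: str
    model_type: str
    validation_name: str
    dataset: str
    modality: str
    annotation_set: str
    samples_total: int
    samples_done: int
    samples_failed: int
    tags: list[str] = field(default_factory=list)
    source: str = ""

# Hypothetical "Base Validation" selection.
base = ValidationSummary(
    model_name="classifier-a", model_type="classification",
    validation_name="baseline-run", dataset="chest-xray-v1", modality="X-ray",
    annotation_set="radiologist-consensus", samples_total=1000,
    samples_done=990, samples_failed=10, tags=["baseline"], source="upload",
)
```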

2. Key Performance Metrics

Side-by-side metrics help users assess both models at a glance; a computation sketch follows the list:

  • Accuracy
  • True Positives (TP)
  • True Negatives (TN)
  • False Positives (FP)
  • False Negatives (FN)
  • Sensitivity
  • Specificity
  • Precision

3. Class Performance

This section compares the performance of each class between validations:

  • Per-class Accuracy
  • Sensitivity and Specificity
  • Precision and F1 Score

Charts highlight any performance degradation or improvement between versions or datasets.
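The per-class values can be understood as one-vs-rest metrics computed for each label. A minimal sketch, assuming integer class labels (illustrative only, not the platform's implementation):

```python
import numpy as np

def per_class_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """One-vs-rest accuracy, sensitivity, specificity, precision and F1 per class."""
    results = {}
    for cls in np.unique(y_true):
        pos_true = y_true == cls
        pos_pred = y_pred == cls
        tp = np.sum(pos_pred & pos_true)
        tn = np.sum(~pos_pred & ~pos_true)
        fp = np.sum(pos_pred & ~pos_true)
        fn = np.sum(~pos_pred & pos_true)
        precision = tp / max(tp + fp, 1)
        sensitivity = tp / max(tp + fn, 1)
        results[int(cls)] = {
            "accuracy": (tp + tn) / len(y_true),
            "sensitivity": sensitivity,
            "specificity": tn / max(tn + fp, 1),
            "precision": precision,
            "f1": 2 * precision * sensitivity / max(precision + sensitivity, 1e-9),
        }
    return results
```

Computing this for both validations and plotting the values per class is what surfaces the degradations or improvements mentioned above.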


4. Confidence Threshold Evaluation

Users can compare how both models behave at different confidence thresholds:

  • Shows how performance metrics shift as the confidence threshold changes
  • Useful for threshold tuning and model calibration
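
One way to picture the threshold sweep, as a sketch assuming each validation provides a per-sample confidence score (the variable names and data below are hypothetical):

```python
import numpy as np

def metrics_at_threshold(y_true, scores, threshold):
    """Binarize confidence scores at a threshold and report sensitivity/specificity."""
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    return {"sensitivity": tp / max(tp + fn, 1), "specificity": tn / max(tn + fp, 1)}

# Sweep a few thresholds for both validations to see where the trade-off differs.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
base_scores = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])
compared_scores = np.array([0.8, 0.1, 0.9, 0.6, 0.2, 0.4, 0.3, 0.2])
for t in (0.3, 0.5, 0.7):
    print(t, metrics_at_threshold(y_true, base_scores, t),
          metrics_at_threshold(y_true, compared_scores, t))
```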

5. Distribution Visualizations

Several comparative charts are available:

  • Prediction Distribution: Number of predicted cases per class
  • Confidence Distribution: How confident each model was for each class
  • Dataset Class Distribution: Breakdown of label frequencies in each dataset
  • ROC & Precision-Recall Curves: Model discrimination ability across thresholds
  • Lift Chart (if available): Measures how efficiently the model separates classes
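
As an example of how the data behind the ROC and Precision-Recall charts can be derived, here is a sketch using scikit-learn, assuming per-sample confidence scores for the positive class are available (the scores and labels below are hypothetical):

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
base_scores = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])
compared_scores = np.array([0.8, 0.1, 0.9, 0.6, 0.2, 0.4, 0.3, 0.2])

# Compute both curves per validation; the areas summarize discrimination ability.
for name, scores in (("base", base_scores), ("compared", compared_scores)):
    fpr, tpr, _ = roc_curve(y_true, scores)
    prec, rec, _ = precision_recall_curve(y_true, scores)
    print(name, "ROC AUC:", auc(fpr, tpr), "PR AUC:", auc(rec, prec))
```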

6. Confusion Matrix

Confusion matrices for both validations are shown side by side:

  • Helps identify class confusion
  • Highlights misclassification patterns
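
A minimal sketch of producing the two matrices for side-by-side inspection, using scikit-learn's confusion_matrix (the labels and predictions below are hypothetical):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
base_pred = np.array([0, 1, 2, 1, 1, 0, 0, 1])
compared_pred = np.array([0, 1, 2, 2, 0, 0, 2, 1])

labels = [0, 1, 2]
cm_base = confusion_matrix(y_true, base_pred, labels=labels)
cm_compared = confusion_matrix(y_true, compared_pred, labels=labels)

# Rows are true classes, columns are predicted classes; off-diagonal cells
# show where each validation confuses one class for another.
print("Base:\n", cm_base)
print("Compared:\n", cm_compared)
```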

7. Mismatched Predictions

This table highlights samples where the two models disagreed:

  • Useful for error analysis
  • Reveals model-specific blind spots
  • Enables targeted review of problematic predictions
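
A sketch of how such a disagreement table could be assembled from two sets of predictions; the sample IDs, labels, and column names are hypothetical:

```python
import pandas as pd

# Hypothetical samples and predictions; not the product's data model.
df = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s4", "s5"],
    "ground_truth": ["cat", "dog", "dog", "cat", "cat"],
    "base_pred": ["cat", "dog", "cat", "cat", "dog"],
    "compared_pred": ["cat", "cat", "dog", "cat", "cat"],
})

# Keep only the samples where the two validations disagree, flagging which
# (if either) prediction matches the ground truth for targeted review.
mismatches = df[df["base_pred"] != df["compared_pred"]].assign(
    base_correct=lambda d: d["base_pred"] == d["ground_truth"],
    compared_correct=lambda d: d["compared_pred"] == d["ground_truth"],
)
print(mismatches)
```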

✅ Tip: Use Dual Comparison to validate changes between model versions, track dataset impact, and gain confidence in model updates before deployment.